REPORT  DOCUMENTATION  PAGE 


Form  Approved 
OMB  No.  0704-0188 


Public reporting burden for this collection of information is estimated to average 1 hour per response, including the time for reviewing instructions, searching existing data sources,
gathering and maintaining the data needed, and completing and reviewing the collection of information. Send comments regarding this burden estimate or any other aspect of this
collection of information, including suggestions for reducing this burden, to Washington Headquarters Services, Directorate for Information Operations and Reports, 1215 Jefferson
Davis Highway, Suite 1204, Arlington, VA 22202-4302, and to the Office of Management and Budget, Paperwork Reduction Project (0704-0188), Washington, DC 20503.


1.  AGENCY  USE  ONLY  (Leave  blank)  2.  REPORT  DATE 

25  March  1997 


3.  REPORT  TYPE  AND  DATES  COVERED 

Final 


4.  TITLE  AND  SUBTITLE 

Models  of  Multiprocessor  Architectures  and 
Algorithms  (Final  Report) 


5.  FUNDING  NUMBERS 


DAAH04-93-G-0229 


6.  AUTHOR(S) 


Eric  E.  Johnson 


7.  PERFORMING  ORGANIZATION  NAME(S)  AND  ADDRESS(ES) 

New  Mexico  State  University 
Las  Cruces,  NM 


8.  PERFORMING  ORGANIZATION 
REPORT  NUMBER 


9.  SPONSORING/MONITORING  AGENCY  NAME(S)  AND  ADDRESS(ES) 
U.S.  Army  Research  Office 
P.O.  Box  12211 

Research  Triangle  Park,  NC  27709-2211 


10.  SPONSORING/MONITORING 
AGENCY  REPORT  NUMBER 

/V(2.0  30q53.S-EL-i-l 


11.  SUPPLEMENTARY  NOTES 

The  views,  opinions  and/or  findings  contained  in  this  report  are  those  of  the 
author(s)  and  should  not  be  construed  as  an  official  Department  of  the  Army 
position,  policy,  or  decision,  unless  so  designated  by  other  documentation. 


12a.  DISTRIBUTION  /  AVAILABILITY  STATEMENT 


12b.  DISTRIBUTION  CODE 


Approved  for  public  release;  distribution  unlimited. 


13.  ABSTRACT  (Maximum  200  words) 

A  key  requirement  for  the  effective  use  of  multiprocessors  in  real-world 
applications  is  an  ability  to  accurately  predict  the  performance  of  a 
specific  algorithm  on  a  specific  architecture.  This  research  developed 
such  a  prediction  methodology,  which  permits  separate  evaluation  of 
algorithm  and  architecture  performance,  with  only  a  small  number  of  cross 
parameters  required  to  link  the  models.  Additional  results  include 
program  behavior  models  that  lead  to  effective  trace  compression  techniques. 




14.  SUBJECT  TERMS 

Computer  architecture,  performance  modeling,  performance 
prediction,  parallel  computing,  multiprocessor,  traces 


15.  NUMBER  OF  PAGES 


16.  PRICE  CODE 


17.  SECURITY  CLASSIFICATION  OF  REPORT 

UNCLASSIFIED 


18.  SECURITY  CLASSIFICATION  OF  THIS  PAGE 

UNCLASSIFIED 


19.  SECURITY  CLASSIFICATION  OF  ABSTRACT 

UNCLASSIFIED 


20.  LIMITATION  OF  ABSTRACT 

UL 


NSN  7540-01-280-5500 


Standard  Form  298  (Rev  2-89) 

Prescribed  by  ANSI  Std.  Z39-18 
298-102 


Models  of  Multiprocessor  Architectures  and  Algorithms 


Final  Report 


Eric  E.  Johnson 


25  March  1997 


U.S.  Army  Research  Office 


Grant  #  DAAH04-93-G-0229 


New  Mexico  State  University 


Approved  for  public  release;  distribution  unlimited. 


The  views,  opinions,  and  findings  contained  in  this  report  are  those  of  the  author  and  should  not  be  construed  as  an  official 
Department  of  the  Army  position,  policy,  or  decision,  unless  so  designated  by  other  documentation. 


Final  Report 

DAAH04-93-G-0229 


Models  of  Multiprocessor  Architectures  and  Algorithms 

Statement  of  the  Problem  Studied 


A  key  requirement  for  the  effective  use  of  multiprocessor  systems  in  real-world  applications  is  an  ability  to 
accurately  predict  the  performance  of  a  specific  algorithm  on  a  specific  architecture.  Such  performance 
prediction  tools  assist  the  system  designer  in  initially  selecting  suitable  algorithms  and  architectures,  and 
then  in  modifying  them  to  improve  performance. 

Existing  techniques  for  joint  performance  prediction  of  multiprocessor  algorithms  and  architectures 
generally  require  joint  evaluation  of  the  model  for  each  algorithm/architecture  pair  of  interest.  This  results 
in  significantly  more  computation  than  an  approach  that  models  algorithms  and  architectures  separately, 
with  joint  performance  computed  from  the  individual  performance  measures.  In  addition,  the  accuracy  of 
current  techniques  is  often  affected  by  assumptions  about  algorithm  and  architectural  behavior  for  which 
few  measurements  have  been  available. 

The  objective  of  this  research  was  to  extend  our  previous  work  in  modeling  multiprocessor  and 
distributed  system  performance  to  produce  an  efficient,  accurate  performance  prediction  methodology 
applicable  to  Army  systems.  This  technique  is  based  upon  measurements  of  important  applications,  and 
permits  separate  evaluation  of  algorithm  and  architecture  performance  with  only  a  small  number  of  “cross” 
parameters  required  to  link  the  two  models. 
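
The computational argument above can be illustrated with a minimal sketch. It is not the model developed under this grant: the function names, the dictionary fields (`ops`, `mem_refs`, `ops_per_sec`, `sec_per_mem_ref`), the combining formula, and all numeric values are hypothetical stand-ins, chosen only to show why characterizing each side once requires fewer expensive model evaluations than evaluating every algorithm/architecture pair jointly.

```python
def joint_model_runs(n_algorithms, n_architectures):
    """Expensive evaluations needed when every pair is modeled jointly."""
    return n_algorithms * n_architectures


def separate_model_runs(n_algorithms, n_architectures):
    """Expensive evaluations needed when each side is characterized once."""
    return n_algorithms + n_architectures


def combine(alg, arch):
    """Hypothetical cheap combination of two independent characterizations.

    Assumes predicted time = compute time + memory time; the actual
    linking parameters of the report's model are not given in the text.
    """
    compute_time = alg["ops"] / arch["ops_per_sec"]
    memory_time = alg["mem_refs"] * arch["sec_per_mem_ref"]
    return compute_time + memory_time


# Illustrative (not measured) characterizations.
algorithms = {
    "alg_a": {"ops": 2.0e9, "mem_refs": 5.0e8},
    "alg_b": {"ops": 8.0e9, "mem_refs": 1.0e9},
}
architectures = {
    "arch_x": {"ops_per_sec": 1.0e9, "sec_per_mem_ref": 2.0e-9},
    "arch_y": {"ops_per_sec": 4.0e9, "sec_per_mem_ref": 1.0e-9},
}

# Every pairwise prediction is now a cheap arithmetic step.
predictions = {
    (a, m): combine(algorithms[a], architectures[m])
    for a in algorithms
    for m in architectures
}

if __name__ == "__main__":
    print(joint_model_runs(10, 8), "joint runs vs",
          separate_model_runs(10, 8), "separate runs")
    for pair, seconds in sorted(predictions.items()):
        print(pair, round(seconds, 3), "s")
```

Under these assumptions, ten algorithms and eight architectures need 80 joint evaluations but only 18 separate characterizations, and the gap widens as either set grows.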


Specific Aims: Measure the characteristics of parallel computer programs of the types used in Army systems;
validate a theoretical model of parallel computer system performance; and establish a database of parallel
computing performance measurements accessible over the Internet.

Findings:  Researchers  supported  by  this  contract  developed  the  following: 

1.  An  efficient,  accurate  methodology  for  predicting  the  performance  of  algorithms  on  parallel 
architectures.  This  methodology  was  validated  both  in  our  own  work  and  by  outside  researchers. 

2.  A  parallel  algorithm  for  tracking  multiple  targets  in  video  imagery  that  is  portable  among  nearly  all 
parallel  architecture  classes,  and  that  scales  well  from  small  to  large  multiprocessors.  Apart  from  its 
direct  utility  in  Army  programs,  this  algorithm  is  a  useful  vehicle  for  validating  our  performance 
prediction  model  over  a  very  wide  range  of  parallel  machines. 

3.  A  portable  parallel  architecture  simulator  and  cache  simulator. 

4.  A  model  of  redundancy  in  instruction  and  data  traces  that  can  be  applied  to  reduce  trace  sizes  by  up 
to  two  orders  of  magnitude.  The  significance  of  the  latter  result  is  evident  in  the  widespread  use  of 
our  PDATS  trace  format  in  the  computer  architecture  community. 

5.  A  database  of  uniprocessor  and  multiprocessor  traces  that  is  in  use  by  researchers  worldwide. 

6.  Hardware  to  capture  complete  and  filtered  traces  in  real  time  from  uniprocessors  and  multiprocessors. 

This  body  of  work  is  cited  internationally  in  online  references  used  by  the  research  community  (e.g.: 
http://www.hensa.ac.uk/parallel/simulation/architectures/pdats/index.html, 
http://www.hensa.ac.uk/parallel/acronyms/index.html, 
http://www.cs.newcastle.edu.au/Research/VRMG/index.html). 
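
Finding 4 above concerns redundancy in instruction and data traces. As a rough illustration of how such redundancy can be exploited, the sketch below applies generic delta encoding followed by run-length encoding to a list of addresses. It is not the PDATS format, whose layout is not described in this report; the function names and the example trace are assumptions made for illustration only.

```python
def compress(addresses):
    """Delta-encode, then run-length-encode, a list of integer addresses.

    Returns a list of (delta, count) pairs. The first delta is taken
    relative to address 0.
    """
    runs = []
    prev = 0
    for addr in addresses:
        delta = addr - prev
        prev = addr
        if runs and runs[-1][0] == delta:
            runs[-1][1] += 1          # same stride as before: extend the run
        else:
            runs.append([delta, 1])   # new stride: start a new run
    return [tuple(run) for run in runs]


def decompress(runs):
    """Invert compress(): rebuild the original address list."""
    addresses = []
    addr = 0
    for delta, count in runs:
        for _ in range(count):
            addr += delta
            addresses.append(addr)
    return addresses


# Idealized example: 1000 sequential 4-byte references starting at 0x1000.
trace = [0x1000 + 4 * i for i in range(1000)]
packed = compress(trace)

if __name__ == "__main__":
    print(len(trace), "references ->", len(packed), "(delta, count) pairs")
```

A perfectly sequential trace like this one collapses to two pairs. Real traces are less regular, so the reduction obtained in practice depends on how much stride repetition the program exhibits.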